A Comparison of Push and Pull Techniques for AJAX

Engin Bozdag, Delft University of Technology, The Netherlands, v.e.bozdag@student.tudelft.nl
Ali Mesbah, Delft University of Technology, The Netherlands, A.Mesbah@tudelft.nl
Arie van Deursen, Delft Univ. of Technology & CWI, The Netherlands, Arie.vanDeursen@tudelft.nl

1-4244-1450-4/07/$25.00 © 2007 IEEE

Abstract

AJAX applications are designed to have high user interactivity and low user-perceived latency. Real-time dynamic web data such as news headlines, stock tickers, and auction updates need to be propagated to the users as soon as possible. However, AJAX still suffers from the limitations of the Web's request/response architecture, which prevents servers from pushing real-time dynamic web data. Such applications usually use a pull style to obtain the latest updates, where the client actively requests the changes based on a predefined interval. It is possible to overcome this limitation by adopting a push style of interaction where the server broadcasts data when a change occurs on the server side. Both these options have their own trade-offs. This paper explores the fundamental limits of browser-based applications and analyzes push solutions for AJAX technology. It also shows the results of an empirical study comparing push and pull.

1. Introduction

The classical style of the web, called REST (Representational State Transfer) [5], requires all communication between the browser and the server to be initiated by the client, i.e., the end user clicks on a button or link and thereby requests a new page from the server. In this scheme, each interaction between the client and the server is independent of the other interactions. No 'permanent' connection is established between the client and the server, and the server maintains no state information about the clients. This scheme helps scalability, but precludes servers from sending asynchronous notifications.

There are, however, many use cases where it is important to update the client-side interface as soon as possible in response to server-side changes. An auction web site where the users need to be alerted that another bidder has made a higher bid, a stock ticker, a news portal, or a chat-room where new messages are sent immediately to the subscribers, are all examples of such use cases.

Today, such web applications requiring real-time event notifications are usually implemented using a pull style, where the client component actively requests the state changes using client-side timeouts. An alternative to this is the push-based style, where the clients subscribe to their topic of interest, and the server publishes the changes to the clients asynchronously every time its state changes.

The recent trend of Web 2.0 applications and AJAX (Asynchronous JavaScript and XML) [7] is designed to have high user interactivity and low user-perceived latency [13]. Introducing a push style into AJAX systems [10] can further improve the responsiveness of such applications toward end users.

However, implementing such a push solution for web applications is not trivial, mainly due to the limitations of the HTTP protocol. This research explores the fundamental limits of browser-based applications to provide real-time data. We explore how real-time event notification can be added to today's AJAX technology and compare the pull and push approaches to find out the actual trade-offs of each approach.

This paper is further organized as follows. Section 2 shows current techniques to implement HTTP based push and discusses the BAYEUX protocol [17], which tries to bring a standard to HTTP push. Section 3 explains our setup for the push-pull experiment. Section 4 presents the results of the empirical study involving push and pull. Section 5 discusses the results of the study. Section 6 summarizes related work on this area. Finally, Section 7 ends this paper with concluding remarks.

2. Web-based Real-time Event Notification

2.1. AJAX

AJAX [7] is an approach to web application development utilizing a combination of established web technologies: standards-based presentation using XHTML and CSS, dynamic display and interaction using the Document Object Model, data interchange and manipulation, asynchronous data retrieval using XMLHttpRequest, and JavaScript binding everything together. XMLHttpRequest is an API implemented by most modern web browser scripting engines to transfer data to and from a web server using HTTP, by establishing
an independent communication channel in the background between a web client and server.

It is the combination of these technologies that enables us to adopt principal software engineering paradigms, such as component- and event-based, for web application development. Our earlier work [13] on an architectural style for AJAX, called SPIAR, gives an overview of the new way web applications can be architected using AJAX. Adopting AJAX has become a serious option not only for newly developed applications, but also for migrating [14] existing web sites to increase the responsiveness.

The evolution of the web and the advent of Web 2.0, and AJAX in particular, is making the users' experience similar to using a desktop application. Well-known examples include Gmail and the new version of Yahoo! Mail.

The REST style makes a server-initiated HTTP request impossible. Every request has to be initiated by a client, precluding servers from sending asynchronous notifications without a request from the client [1]. There are several solutions used in practice that still allow the client to receive (near) real-time updates from the server. In this section we will analyze some of such solutions.

2.2. HTTP Pull

Most AJAX applications check with the server at regular user-definable intervals known as Time to Refresh (TTR). This check occurs blindly regardless of whether the state of the application has changed.

In order to achieve high data accuracy and data freshness, the pulling frequency has to be high. This, in turn, induces high network traffic and possibly unnecessary messages. The application also wastes some time querying for the completion of the event, thereby directly impacting the responsiveness to the user. Ideally, the pulling interval should be equal to the Publish Rate (PR), i.e., the rate at which the state changes. If the frequency is too low, the client can miss some updates.

This scheme is frequently used in web systems, since it is robust, simple to implement, allows for offline operation, and scales well to a high number of subscribers [8]. Mechanisms such as Adaptive TTR [3] allow the server to change the TTR, so that the client can pull at different frequencies, depending on the change rate of the data. This dynamic TTR approach in turn provides better results than a static TTR model [18]. However, it will never reach complete data accuracy, and it will create unnecessary traffic.

2.3. HTTP Streaming

HTTP Streaming is a basic and old method that was first introduced on the web in 1992 by Netscape, under the name 'dynamic document' [15]. HTTP Streaming comes in two forms, namely Page and Service Streaming.

Page Streaming

This method simply consists of streaming server data in the response of a long-lived HTTP connection. Most web servers do some processing, send back a response, and immediately exit. But in this pattern, the connection is kept open by running a long loop. The server script uses event registration or some other technique to detect any state changes. As soon as a state change occurs, it streams the new data and flushes it, but does not actually close the connection. Meanwhile, the browser must ensure the user interface reflects the new data, while still waiting for the response from the server to finish.

Service Streaming

This time, it is an XMLHttpRequest connection that is long-lived in the background, instead of the initial page load. This brings some flexibility regarding the length and frequency of connections. The page will be loaded normally (one time), and streaming can be performed with a predefined lifetime for the connection. The server will loop indefinitely just like in page streaming; the browser has to read the latest response.

2.4. COMET and the BAYEUX Protocol

The application of the Service Streaming scheme under AJAX is now known as Reverse AJAX or COMET [16]. COMET enables the server to send a message to the client when the event occurs, without the client having to explicitly request it.

As a response to the lack of communication standards [13] for AJAX applications, the Cometd group¹ released a COMET protocol draft called BAYEUX [17]. The BAYEUX message format is defined in JSON (JavaScript Object Notation)², which is a data-interchange format based on a subset of the JavaScript Programming Language. The protocol has recently been implemented and included in a number of web servers including Jetty³ and IBM Websphere. This protocol currently provides a connection type called Long Polling for HTTP push, which is implemented in Jetty's Cometd library⁴.

Long Polling (also known as Asynchronous-Polling) is a mixture of pure server push and client pull. After a subscription to a channel, the connection between the client and the server is kept open for a defined period of time (by default 45 seconds). If no event occurs on the server side, a timeout occurs and the server asks the client to reconnect asynchronously. If an event occurs, the server sends the data to the client and the client reconnects.

¹ http://www.cometd.com
² http://www.json.org
³ http://www.mortbay.org
⁴ http://www.mortbay.org

This protocol follows the 'topic-based' [4] publish-subscribe scheme, which groups events according to their
topic (name) and map individual topics to distinct communication channels. Participants subscribe to individual topics, which are identified by keywords. Like many modern topic-based engines, BAYEUX offers a form of hierarchical addressing, which permits programmers to organize topics according to containment relationships. It also allows topic names to contain wildcards, which offers the possibility to subscribe and publish to several topics whose names match a given set of keywords. BAYEUX defines the following phases in order to establish a COMET connection:

1. Client performs a handshake with the server, receives a client id and list of supported connection types (IFrame, long-polling, etc.).
2. Client sends a connection request with its id and its preferred connection type.
3. Client later subscribes to a channel and receives updates.

In the remainder of this paper, we will use BAYEUX as the protocol for server push, and compare its performance with a pure pull based solution.

[Figure 1. Experimental Environment.]

3. Experimental Design

In this section we will present our experimental setup.

3.1. Goals and Setup

The goals of our experiment consist of exploring the actual performance trade-offs of a COMET push implementation and comparing it to a pure pull approach on the web by conducting a controlled empirical study. The experiment has to be repeatable for push and pull but also for different input variables such as number of users, number of published messages and intervals. We aim at achieving these goals by:

* creating a push application consisting of the client and the server parts,
* creating the same application for pull,
* creating an application which publishes a variable number of data items at certain intervals,
* mimicking many concurrent web clients operating on each application,
* gathering data and measuring: the mean time it takes for clients to receive a new published message, the load on the server, number of messages sent or retrieved, the effects of changing the data publish rate and number of users,
* analyzing and explaining the measurements found.

To see how the application server reacts to different conditions, we use different combinations of three variables:

* Number of concurrent users (100, 200, 350, 500, and 1000). The variation helps to find a maximum number of users the server can handle simultaneously, and 1000 seemed to be the upper bound for our test. This is because the server was already running on 100% CPU with 1000 users. We also tried 2000 and 5000 users; however, the server was so saturated that it was not able to send any updates anymore.
* Publish interval (5, 10, 15, 20, and 50 seconds): The frequency of the publishing updates is also important. Because of the long polling implementation in BAYEUX (see Section 2), the system should act more like pure pull when the publish interval is small, and more like pure push when it is bigger. We chose the interval 50 seconds because the client timeout of the BAYEUX protocol is 45 seconds, and we expect this interval to cause many disconnects, hence affecting the performance.
* Push or Pull: We also made an option in our test script that allowed us to switch between pull and push. To make the total number of combinations smaller, we set the pull interval as 15 seconds.
* Total number of messages: For each combination, we generated a total of 10 publish messages.

3.2. Tools

In order to simulate a high number of clients, we evaluated several open source solutions. Grinder⁵ seemed to be a good option, providing an internal TCPProxy, which allows recording events sent by the browser and replaying them later. It also provided scripting support, which allowed us to create a script that simulates a browser connecting to the push server, subscribing to a particular stock channel and receiving push data continuously. In addition, Grinder has a built-in feature that allows us to create multiple threads of a simulating script.

Because of the distributed nature of the simulated clients on different nodes, we used Log4J's SocketServer⁶ to set up a logging server that listens for incoming log messages. The clients then send the log messages using the SocketAppender.

⁵ [...]
⁶ http://logging.apache.org/log4j/docs/
⁷ http://www.tcpdump.org/

We used TCPDump⁷ to record the number of TCP (HTTP) packets sent to and from the server. We also created
a script that uses the UNIX top utility⁸ to record the CPU usage of the application server. This was necessary to observe the scalability and performance of each approach.

3.3. Sample Application

In order to respond to publish events and create client-side processing, we developed a Stock Ticker application.

The push version consists of a JSP page which uses Dojo's Cometd library⁹ to subscribe to a channel and receive the stock data. We use Rico¹⁰ to give color effects to different data values on the web interface. For the server side, we developed a Java Servlet (PushServlet) that pushes the data into the browsers using the Cometd library. The PushServlet manages the client connections, receives data from the back-end, and publishes it to the clients.

The pull version also has one JSP page, but instead of Cometd, it uses the normal bind method of Dojo to request data from the server. The pull nature was set using the standard setInterval JavaScript method. On the server, a PullServlet was made which keeps an internal stock object (the most recent one) and simply handles every incoming request the usual way.

The Service Provider is a Java application which uses the HTTPClient library¹¹ to publish stock data to the Servlets. The number of publish messages as well as the interval at which the messages are published are configurable.

Simulating clients. To simulate many concurrent clients we use the TCPProxy to record the actions of the JSP/Dojo client pages for push and pull and create scripts for each in Jython¹². Jython is an implementation of the high-level, dynamic, object-oriented language Python, integrated with the Java platform. It allows the usage of Java objects in a Python script and is used by Grinder to simulate web users. In our tests, Jython scripts are actually imitating the JSP/Dojo client pages.

3.4. Testing Environment

We use the Distributed ASCI Supercomputer 3 (DAS3)¹³ to run various numbers of web clients on different distributed nodes. The DAS3 cluster at the Delft University consists of 68 dual-CPU 2.4 GHz AMD Opteron DP 250 compute nodes, each having 4 GB of memory. The cluster is equipped with 1 and 10 Gigabit/s Ethernet, and runs Scientific Linux 4. The application server runs on a Pentium IV, 3 GHz (Hyperthreading) machine with 1 GB memory, and Linux Fedora as its Operating System. We use Jetty 6.1.2 as our application server, because it is the only open-source Java EE application server that currently implements the COMET BAYEUX protocol. Jetty uses Java's new IO package (NIO). The NIO package follows the event-driven design, which allows the processing of each task as a finite state machine (FSM). As the number of tasks reaches a certain limit, the excess tasks are absorbed in the server's event queue. The throughput remains constant and the latency shows a linear increase. The event-driven design is supposed to perform significantly better than the thread-concurrency model [20, 21]. The connectivity between the server and DAS3 nodes is through a 100 Mbps ethernet connection.

3.5. Sequence of events

A routine test run consists of the following steps (see Figure 1):

1. The Service Provider publishes the stock data to the application server via an HTTP POST request, in which the creation date, the stock item id, and the stock data are specified.
2. For push: The application server pushes the data to all the subscribers of that particular stock. For pull: the application server updates the internal stock object, so that when clients send pull requests, they get the latest data.
3. Each simulated client logs the responses (after some calculation) and sends it to the statistics server. Grinder also processes the data from each client and sends the statistics, such as response time, to the statistics server, which runs on a separate machine.

It is worth noting that we use a combination of the 64 DAS3 nodes and Grinder threads to simulate different numbers of users.

3.6. Data Analysis

We created a Data Analyzer that reads the data from Grinder and Logging Server logs and writes all the info into a database using Hibernate¹⁴. This way, different views of the data can be obtained easily using queries to the database.

⁸ http://www.unixtop.org/
⁹ http://dojotoolkit.org/
¹⁰ http://www.openrico.org/
¹¹ http://jakarta.apache.org/commons/httpclient/
¹² http://www.jython.org
¹³ http://www.cs.vu.nl/das3/overview.shtml
¹⁴ http://www.hibernate.org

4. Results

In the following subsections, we present the results which we obtained using the combination of variables mentioned in 3.1. Figures 2-5 depict the results. Note that for each number of clients on the x-axis, the five publish intervals in seconds (5, 10, 15, 20, 50) are presented.
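The combinations of variables from Section 3.1 that underlie these results can be enumerated in a few lines. The following is only an illustrative sketch in Python (the language Jython implements), not the authors' test script: the names `test_matrix` and `expected_pull_requests` and the dictionary layout are hypothetical, while the numeric values are the ones stated in the text. The second function is a rough back-of-the-envelope estimate that assumes a pull client polls at a fixed 15-second interval for as long as the 10 messages are being published.

```python
from itertools import product

# Values as stated in Section 3.1 and the Results introduction.
USER_COUNTS = (100, 200, 350, 500, 1000)
PUBLISH_INTERVALS_S = (5, 10, 15, 20, 50)
MODES = ("push", "pull")
PULL_INTERVAL_S = 15      # fixed pull interval used for all pull runs
MESSAGES_PER_RUN = 10     # publish messages generated per combination


def test_matrix():
    """Return one dict per test run (hypothetical structure, for illustration)."""
    runs = []
    for mode, users, interval in product(MODES, USER_COUNTS, PUBLISH_INTERVALS_S):
        runs.append({
            "mode": mode,
            "users": users,
            "publish_interval_s": interval,
            # Only pull clients poll on a timer; push clients wait for the server.
            "pull_interval_s": PULL_INTERVAL_S if mode == "pull" else None,
            "messages": MESSAGES_PER_RUN,
        })
    return runs


def expected_pull_requests(publish_interval_s,
                           messages=MESSAGES_PER_RUN,
                           pull_interval_s=PULL_INTERVAL_S):
    """Rough number of polls one pull client issues while all messages are published."""
    run_duration_s = publish_interval_s * messages
    return run_duration_s // pull_interval_s


if __name__ == "__main__":
    print(len(test_matrix()), "combinations")
    for interval in PUBLISH_INTERVALS_S:
        print(interval, "s publish interval ->",
              expected_pull_requests(interval), "polls per pull client")
```

Under these assumptions a publish interval of 50 seconds gives about 33 polls per pull client for only 10 published messages, which illustrates why a fixed pull interval produces redundant requests when data changes slowly, and why a pull client can miss updates when the publish interval is shorter than the pull interval.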
[Figure 2. Mean publish triptime. Panels: Pull (interval 15 s) and Push; x-axis: number of clients.]

4.1. Publish triptime

We define triptime as follows:

Triptime = |Data Creation Date - Data Receipt Date|

Data Creation Date is the date on the Service Provider (Publisher) the moment it creates a message, and Data Receipt Date is the date on the client the moment it receives the message. Triptime shows how long it takes for a publish message to reach the client and can be used to find out how fast the client gets notified with the latest events. Note that it is very important to synchronize the datetime for both the Service Provider and the clients.

Figure 2 shows the mean publish triptime versus the total number of clients, for both pull and push techniques.

4.2. Server Performance

Since push is stateful, we expect it to have some administration costs on the server side, using resources. In order to compare this with pull, we measured the CPU usage for both approaches. Figure 3 shows the mean server CPU usage as the number of clients grows, for push and pull.

[Figure 3. Server application CPU usage. Panels: Pull (interval 15 s) and Push; x-axis: number of clients.]

4.3. Received Publish Messages

To see how pull compares to pure push in message overhead, we published a total of 10 messages and we counted the total number of (non-unique) messages received on the client side. Figure 4 shows the mean number of received non-unique publish items versus the total number of clients, for both push and pull. Note that if a pull client makes a request while there is no new data, it will receive the same item multiple times. This way a client might receive more than 10 messages.

4.4. Received Unique Publish Messages

It is also interesting to see how many of the 10 messages we have published reach the clients. This way we can determine if the clients miss any publish items. Figure 5 shows the mean number of received unique publish items versus the total number of clients.

5. Discussion

5.1. Data Coherence

We define a piece of data as coherent if the data on the server and the client is synchronized. We check the data coherence of the approaches by measuring the triptime. As we can see in Figure 2, the triptime is, at most, 1,750 milliseconds with push. In pull, this can go up to 25 seconds. This shows us that pull is not as responsive as push, and if we need high data coherence, we should always choose the push approach. In Figure 2 we also see that with 1000 users and a publish interval of 50 seconds, the triptime increases noticeably. With such a big interval, no response is being sent to the client, and the client is waiting for data, thus occupying a thread. This makes it hard for other clients to reconnect and get new data,
which increases the triptime. With an interval of 5 seconds, the triptime is lower. This is because the clients are quickly receiving responses and disconnecting. This makes some threads available, which makes it possible for other clients to connect.

[Figure 4. Mean Number of Received Publish Items. Panels: Pull (interval 15 s) and Push; x-axis: number of clients.]

5.2. Server Performance

One of the main issues of all distributed systems, and in particular that of web-based applications, is scalability and performance. As depicted in Figure 3, the pull style has a much better performance compared to push, and this is valid even for a small number of users (e.g., 100). With push, when the number of clients is increased to 350, the server is practically saturated, i.e., the CPU is running at almost 100%. This is mainly due to the fact that the push server has to maintain all the state information about the clients and also manage the corresponding threads and connections. A push server based on long polling also needs to generate numerous request/response cycles to keep the connection alive, which impacts the resources. With pull, only the publish interval has a direct measurable effect on the performance. This shows us that if we want to use a push implementation even for a couple of hundreds of users, some load balancing solution and multiple servers are needed.

5.3. Network Performance

As we mentioned in Section 2.2, in a pure pull system, the pulling frequency has to be high to achieve high data accuracy and data freshness. If the frequency is higher than the data generation interval, the pull client will pull the same data more than once, leading to some overhead.

In Figure 4 we see that with a publish interval of 50 seconds, pull clients receive approximately 35 messages, while we published only 10. In the same figure we see that push clients received approximately a maximum of 10 messages. This means that more than 2/3 of the total number of pull requests were unnecessary. Furthermore, we see that the number of packages received does not depend on the number of clients.

If we look at the Push graph in Figure 5, we notice that as the number of users increases, not all clients receive all 10 messages. The number of correctly received messages is quite good with 100 users, but, unlike the pure pull approach, it begins to degrade as the users increase. This shows that Jetty's Cometd implementation is not stable and scalable enough.

5.4. Data Misses

According to Figure 5, if the publish interval is 20 or 50 (i.e., larger than the pull interval), the client receives all the messages. However, as we have discussed in the previous subsection, this will generate an unnecessary number of messages. Looking at the figure again, we see that when the pull interval is larger than the publish interval, the clients will miss some updates, regardless of the number of users. So, with the pull approach, we need to know the exact publish interval. However, the publish interval tends to change, which makes it difficult for a pure pull implementation. With push, when the number of clients is small, the client will receive all the messages. However, if the number of clients increases, and the publish interval is large, some data loss starts to occur. This is again due to the high number of idle threads, which affects the server performance.

5.5. Threats to Validity

We use several tools to obtain the data. The shortcomings and the problems of the tools themselves can have an effect on the outcome. In addition, implementation issues in the application server Jetty 6.1.2 might lead to the high CPU usage.

Another threat is the pull interval. We use only 1 pull interval, namely 15 seconds. Different pull intervals might have an influence on the performance of the server and the data coherence.

Clients can also have different environments (i.e., the browser they use, the bandwidth they have, etc.). This can have an influence on the triptime variable. In order to avoid
that, we used the same test-script in all the simulated clients and allocated the same bandwidth.

The time can also be a threat to validity. To measure the triptime, the difference between the data creation date and data receipt date is calculated. However, if the time on the server and the clients is different, this might give a false triptime. In order to prevent this problem, we made sure that the time on server and client machines is synchronized by using the same time server.

We measure the data coherence by taking the triptime. However, the data itself must be 'correct', i.e., the received data must be the same data that has been sent by the server. We rely on HTTP in order to achieve this 'data correctness'. However, additional experiments must include a self check to ensure this requirement.

[Figure 5. Mean Number of Received Unique Publish Items. Panels: Pull (interval 15 s) and Push; x-axis: number of clients.]

6. Related Work

There are a number of papers that discuss server-initiated events, known as push; however, most of them focus on client/server distributed systems and non-HTTP multimedia streaming or multi-casting with a single publisher [1, 9, 6, 2, 19].

The only work that focuses on AJAX is the white-paper of Khare [10]. Khare discusses the limits of the pull approach for certain AJAX applications and mentions several use cases where a push application is much more suited. However, the white-paper does not mention possible issues with this push approach such as scalability and performance.

Khare and Taylor [11] propose a push approach called ARRESTED. Their asynchronous extension of REST, called A+REST, allows the server to broadcast notifications of its state changes. The authors note that this is a significant implementation challenge across the public Internet.

The research of Acharya et al. [1] focuses on finding a balance between push and pull by investigating techniques that can enhance the performance and scalability of the system. According to the research, if the server is lightly loaded, pull seems to be the best strategy. In this case, all requests get queued and are serviced much faster than the average latency of publishing. The study is not focused on HTTP.

Bhide et al. [3] also try to find a balance between push and pull, and present two dynamic adaptive algorithms: Push and Pull (PaP), and Push or Pull (PoP). According to their results, both algorithms perform better than pure pull or push approaches. Even though they use HTTP as messaging protocol, they use custom proxies, clients, and servers. They do not address the limitations of browsers nor do they perform load testing with a high number of users.

Hauswirth and Jazayeri [8] introduce a component and communication model for push systems. They identify components used in most Publish/Subscribe implementations. The paper mentions possible problems with scalability, and emphasizes the necessity of a specialized, distributed, broadcasting infrastructure.

Eugster et al. [4] compare many variants of Publish/Subscribe schemes. They identify three alternatives: topic-based, content-based, and type-based. The paper also mentions several implementation issues, such as events, transmission media and qualities of service, but again the main focus is not on web-based applications.

Flatin [12] compares push and pull from the perspective of network management. The paper mentions the publish/subscribe paradigm and how it can be used to conserve network bandwidth as well as CPU time on the management station. The paper suggests the 'dynamic document' solution of Netscape [15], but also a 'position swapping' approach in which each party can both act as a client and a server. This solution, however, is not applicable to web browsers. Making a browser act like a server is not trivial and it induces security issues.

As far as we know, there has been no empirical study conducted to find out the actual trade-offs of applying pull/push on browser-based or AJAX applications.

7. Conclusion

In this paper we have compared pull and push solutions for achieving web-based real time event notification. The contributions of this paper include the experimental design, a
reusable implementation of a sample application in push and pull style as well as a measurement framework, and the experimental results.

Our experiment shows that if we want high data coherence and high network performance, we should choose the push approach. However, push brings some scalability issues; the server application CPU usage is 7 times higher than in pull. According to our results, the server starts to saturate at 350-500 users. For a larger number of users, load balancing and server clustering techniques are unavoidable.

With the pull approach, achieving total data coherence with high network performance is very difficult. If the pull interval is higher than the publish interval, some data misses will occur. If it is lower, network performance will suffer. Pull performs well only if the pull interval equals the publish interval. However, in order to achieve that, we need to know the exact publish interval beforehand, and the publish interval is rarely static and predictable. This makes pull useful only in situations where the data is published frequently according to some pattern.

These results allow engineers to make rational decisions concerning key parameters such as pull and push intervals, in relation to, e.g., the anticipated number of clients. Furthermore, the

[4] P. T. Eugster, P. A. Felber, R. Guerraoui, and A.-M. Kermarrec. The many faces of publish/subscribe. ACM Comput. Surv., 35(2):114-131, 2003.
[5] R. T. Fielding and R. N. Taylor. Principled design of the modern web architecture. ACM Trans. Inter. Tech., 2(2):115-150, 2002.
[6] M. Franklin and S. Zdonik. Data in your face: push technology in perspective. In SIGMOD '98: Proceedings of the 1998 ACM SIGMOD international conference on Management of data, pages 516-519. ACM Press, 1998.
[7] J. Garrett. Ajax: A new approach to web applications. Adaptive Path: http://adaptivepath.com/publications/essays/archives/000385.php, 2005.
[8] M. Hauswirth and M. Jazayeri. A component and communication model for push systems. In ESEC/FSE '99, pages 20-38. Springer-Verlag, 1999.
[9] K. Juvva and R. Rajkumar. A real-time push-pull communications model for distributed real-time and multimedia systems. Technical Report CMU-CS-99-107, School of Computer Science, Carnegie Mellon University, January 1999.
[10] R. Khare. Beyond Ajax: Accelerating web applications with real-time event notification. Knownow.com, white-paper.
[11] R. Khare and R. N. Taylor. Extending the representational state transfer (REST) architectural style for decentralized systems. In ICSE '04: Proceedings of the 26th International Conference on Software Engineering, pages 428-437. IEEE
experimental
design
allows
them
to
repeat
similar
Computer
Society,
2004.
measurements
for
their
own
(existing
or
to
be
developed)
ap-
[12]
J.-P.
Martin-Flatin.
Push
vs.
pull
in
web-based
network
man-
plications.
agement.
http://arxiv.org/pdf/cs/9811027,
1999.
[13]
A.
Mesbah
and
A.
van
Deursen.
An
architectural
style
Our
future
work
includes
adopihng
a
hybrid
approach
that
for
Ajax.
In
WICSA
'07:
Proceedings
of
the
6th
Working
combainthes
pullfand
pu
botechniquroaes.r
A
applsoicatiton
IEEE/IFIP
Conference
on
Software
Architecture,
pages
44-
to
gain
the
benefits
of
both
approaches.
We
also
intend
to
53.
IEEE
Computer
Society,
2007.
extend
our
testing
experiments
to
different
versions
of
Jetty
[14]
A.
Mesbah
and
A.
van
Deursen.
Migrating
multi-page
and
alternative
push
server
implementations,
for
example
web
applications
to
single-page
Ajax
interfaces.
In
CSMR
ones
that
are
based
on
holding
a
permanent
connection
(e.g.,
'07:
Proceedings
of
the
11th
European
Conference
on
Soft-
Lightstreamer15)
as
opposed
to
the
long
polling
approach
ware
Maintenance
and
Reengineering,
pages
181-190.
IEEE
discussed
in
this
paper.
Additional
experiments
with
a
va-
Computer
Society,
2007.
riety
of
pull
intervals
are
also
desired.
[15]
Netscape.
An
exploration
of
dynamic
documents.
http://
wp.
netscape.
com/assist/net_sites/pushpull
html,
1996.
Acknowledgments
Partial
support
was
received
from
Sen-
[16]
A.
Russell.
Comet:
Low
latency
data
for
the
browser.
http:
terNovem,
project
Single
Page
Computer
Interaction
(SPCI),
//alex.dojotoolkit.
org/?p=545.
in
collaboration
with
Backbase.
[17]
A.
Russell,
G.
Wilkins,
and
D.
Davis.
Bayeux
-
a
JSON
protocol
for
publish/subscribe
event
delivery
protocol
0.1draft3.
http://svn.xantus.org/shortbus/trunk/
References
bayeux/bayeux.html,2007.
[18]
R.
Srinivasan,
C.
Liang,
and
K.
Ramamritham.
Maintain-
[1]
S.
Acharya,
M.
Franklin,
and
S.
Zdonik.
Balancing
push
and
ing
temporal
coherency
of
virtual
data
warehouses.
In
RTSS
pull
for
data
broadcast.
In
SIGMOD
'97:
Proceedings of
the
'98:
Proceedings
of
the
IEEE
Real-Time
Systems
Symposium,
1997
ACM
SIGMOD
international
conference
on
Manage-
page
60.
IEEE
Computer
Society,
1998.
ment
of
data,
pages
183-194.
ACM
Press,
1997.
[19]
V.
Trecordi
and
G.
Verticale.
An
architecture
for
effective
[2]
M.
Ammar,
K.
Almeroth,
R.
Clark,
and
Z.
Fei.
Multicast
push/pull
web
surfing.
In
2000
IEEE
International
Confer-
delivery
of
web
pages
or
how
to
make
web
servers
pushy.
ence
on
Communications,
volume
2,
pages
1159-1163,
2000.
Workshop
on
Internet
Server
Performance,
1998.
[20]
M.
Welsh, D.
Culler,
and
E.
Brewer.
Seda:
an
architecture
for
[3]
M.
Bhide,
P.
Deolasee,
A.
Katkar,
A.
Panchbudhe,
K.
Ra-
well-conditioned,
scalable
internet
services.
SIGOPS
Oper
mamritham,
and
P.
Shenoy.
Adaptive
push-pull:
Disseminat-
Syst.
Rev.,
35(5):230-243,
2001.
ing
dynamic
web
data.
IEEE
Trans.
Comput.,
5
1(6):652-668,
[21]
M.
Welsh
and
D.
E.
Culler.
Adaptive
overload
control
for
2002.
busy
internet
servers.
In
USENIX
Symposium
on
Internet
Technologies
and
Systems,
2003.
15
http:
//www.1lightstreamer
.corn
22